Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

risc-v/mpfs: i2c: perform sanity checks #191

Merged
merged 1 commit into from
Dec 19, 2023
Merged

Conversation

eenurkka
Copy link

@eenurkka eenurkka commented Dec 5, 2023

Replace risky DEBUGASSERT()s with real sanity checks. Also, do a few more checks as the system might occasionally fire an interrupt shortly after a warm reboot.

Yet, modify i2c_transfer() function so that up_disable_irq() is always called at the end to better prevent ill-timed irqs.

Summary

Prevent irqs after a warm boot

Impact

Testing

Saluki-v2 / AMP mode

@eenurkka eenurkka requested review from pussuw and jlaitine December 5, 2023 11:05
@pussuw
Copy link

pussuw commented Dec 5, 2023

Did you catch where the interrupt actually fires?

@eenurkka
Copy link
Author

eenurkka commented Dec 5, 2023

Did you catch where the interrupt actually fires?

No (or actually there was a similar bug which I tried to reproduce with no success)

@pussuw
Copy link

pussuw commented Dec 5, 2023

Should I test this patch and try to reproduce the issue ? It can easily be seen in kernel mode due to the illegal access trap.

@eenurkka
Copy link
Author

eenurkka commented Dec 5, 2023

Should I test this patch and try to reproduce the issue ? It can easily be seen in kernel mode due to the illegal access trap.

That would be more than welcome, thanks!

Copy link

@jlaitine jlaitine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question, the rest looks good

* reboot. Try to detect that with some sanity checks.
*/

if (priv == NULL)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would the irq trigger after warm boot? Mss is reset, but what happens with the fpga? Maybe the fabric should be reset at warm boot as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would the irq trigger after warm boot? Mss is reset, but what happens with the fpga? Maybe the fabric should be reset at warm boot as well?

Still working with different build issues, as to verify whether this patch fixes any..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's an irq that comes very early for some reason. This patch makes it no longer crash, but will not ack the irq, so the system keeps flooding the irq giving no time for anything else; so the system appears stuck.

Looks like the ack is needed, and a few unnecessary checks may be avoided. Then it becomes stable; will do more testing.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect the corei2c irq line is asserted when we do the soft reset. When the interrupt source is enabled again, it will fire immediately and unexpectedly.

Is there a way to explicitly clear the interrupt on the corei2c device block ? Or explicitly reset the corei2c device ?

The way we do a soft reset is simply by:

j __start

So a reset signal is not emitted on the soc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the irq is enabled right before issuing the START condition, and disabled at the end of the transaction. Maybe I could try clearing the irq before enabling it, right.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearing the The Serial Interrupt flag before enabling i2c or clearing it before starting any transaction doesn't help any. Not sure that unnecessary interrupt can be avoided at all.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might there be a FIFO in the i2c device that needs to be emptied too ? With SPI at least the RX interrupt didn't de-assert before the FIFO was also reset.

This is strange indeed, to me it would make sense that something does not reset properly and if we do the "j start" operation at a bad time (when the i2c transaction is running) we get old / stale interrupts from the device still.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No FIFO, but data register. Could try read it out before enabling

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the time it's: "Data byte has been received; NACK has been returned." or MPFS_I2C_ST_RX_DATA_NACK 0x58, so it maybe reboots between a transaction

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also MPFS_I2C_ST_RX_DATA_ACK is present, I think if the system is 'rebooted' while in transaction, the peripheral device will continue the i2c transaction which eventually causes an unnecessary interrupt.

Replace risky DEBUGASSERT()s with real sanity checks. Also,
do a few more checks as the system might occasionally fire an
interrupt if the system has been restarted while in middle of
an i2c transaction.

Yet, modify i2c_transfer() function so that up_disable_irq()
is always called at the end to better prevent ill-timed irqs.

Signed-off-by: Eero Nurkkala <[email protected]>
Copy link

@pussuw pussuw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eenurkka eenurkka merged commit 870bf22 into master Dec 19, 2023
6 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants